Adaptive Load Diffusion for Stream Joins

Authors

  • Xiaohui Gu
  • Philip S. Yu
Abstract

Data stream processing has become increasingly important as many emerging applications, such as stock trading surveillance, network traffic monitoring, and sensor data analysis, call for sophisticated real-time processing over data streams. Stream joins, which can be used to detect linkages and correlations between different data streams, are among the most important stream processing operations. One major challenge in processing stream joins is handling continuous, high-volume, and time-varying data streams under resource constraints. In this paper, we present a novel load diffusion system that enables scalable execution of resource-intensive stream joins using an ensemble of server hosts. The load diffusion is achieved by a simple correlation-aware stream partition algorithm. Unlike previous work, the load diffusion system can (1) achieve fine-grained load sharing in the distributed stream processing system; and (2) produce exact query answers without missing any join results or generating duplicate join results. Our experimental results show that the load diffusion scheme can greatly improve the system throughput and achieve a more balanced load distribution.
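The abstract does not spell out the partition algorithm, so the following is only a minimal sketch of the general idea of spreading a windowed stream join over several hosts while still producing each join result exactly once. It is not the authors' correlation-aware algorithm: it uses a plain fragment-and-replicate scheme, in which each tuple of stream 1 is routed to a single host and each tuple of stream 2 is broadcast to all hosts. The names `Host`, `diffuse_join`, the tuple layout `(timestamp, key, payload)`, and the least-loaded routing rule are all assumptions made for illustration.

```python
from collections import deque


class Host:
    """Hypothetical join server: a slice of stream 1 plus a full copy of stream 2."""

    def __init__(self, window):
        self.window = window
        self.s1 = deque()  # buffered stream-1 tuples: (timestamp, key, payload)
        self.s2 = deque()  # buffered stream-2 tuples: (timestamp, key, payload)

    def _expire(self, now):
        # Drop tuples that have fallen out of the sliding time window.
        for buf in (self.s1, self.s2):
            while buf and buf[0][0] < now - self.window:
                buf.popleft()

    def insert(self, stream, tup):
        """Store the tuple and return its matches as (s1_tuple, s2_tuple) pairs."""
        ts, key, _ = tup
        self._expire(ts)
        own, other = (self.s1, self.s2) if stream == 1 else (self.s2, self.s1)
        own.append(tup)
        matches = [t for t in other if t[1] == key]
        return [(tup, m) if stream == 1 else (m, tup) for m in matches]


def diffuse_join(events, n_hosts, window):
    """Join timestamp-ordered events [(stream_id, tuple), ...] over n_hosts hosts."""
    hosts = [Host(window) for _ in range(n_hosts)]
    results = []
    for stream, tup in events:
        if stream == 1:
            # Assumed routing rule: send to the host buffering the fewest
            # stream-1 tuples. Each stream-1 tuple lives on exactly one host,
            # so a matching pair can only ever be produced there.
            target = min(hosts, key=lambda h: len(h.s1))
            results.extend(target.insert(1, tup))
        else:
            # Stream-2 tuples are replicated so every stream-1 slice sees them.
            for h in hosts:
                results.extend(h.insert(2, tup))
    return results


events = [
    (1, (0, "a", "x0")),
    (2, (1, "a", "y1")),
    (1, (2, "b", "x2")),
    (1, (3, "a", "x3")),
    (2, (4, "b", "y4")),
    (2, (9, "a", "y9")),
]
single = diffuse_join(events, n_hosts=1, window=5)
multi = diffuse_join(events, n_hosts=3, window=5)
```

In this sketch the three-host run yields the same set of pairs as the single-host run, with no duplicates, because every pair is owned by the one host that stores its stream-1 tuple. The price is that stream 2 is fully replicated, which is exactly the kind of overhead a finer-grained, correlation-aware partitioning would aim to reduce.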

Similar resources

Modeling of streamflow-suspended sediment load relationship by adaptive neuro-fuzzy and artificial neural network approaches (Case study: Dalaki River, Iran)

Modeling of the stream flow–suspended sediment relationship is one of the most studied topics in hydrology due to its essential application to water resources management. Recently, artificial intelligence has gained much popularity owing to its application in calibrating the nonlinear relationships inherent in the stream flow–suspended sediment relationship. This study made use of adaptive neuro-fuzzy ...

GreedyDual-Join: Locality-Aware Buffer Management for Approximate Join Processing Over Data Streams

We investigate adaptive buffer management techniques for approximate evaluation of sliding window joins over multiple data streams. In many applications, data stream processing systems have limited memory or have to deal with very high speed data streams. In both cases, computing the exact results of joins between these streams may not be feasible, mainly because the buffers used to compute the...

Runtime Optimization of Join Location in Parallel Data Management Systems

Applications running on parallel systems often need to join a streaming relation or a stored relation with data indexed in a parallel data storage system. Some applications also compute UDFs on the joined tuples. The join can be done at the data storage nodes, corresponding to reduce side joins, or by fetching data from the storage system to compute nodes, corresponding to map side join. Both m...

Adaptive Fault-Tolerance for Dynamic Resource Provisioning in Distributed Stream Processing Systems

A growing number of applications require continuous processing of high-throughput data streams, e.g., financial analysis, network traffic monitoring, or Big Data analytics for smart cities. Stream processing applications typically require specific quality-of-service levels to achieve their goals; yet, due to the high time-variability of stream characteristics, it is often inefficient to statica...

Adaptive Batching of Streams to Enhance Throughput and to Support Dynamic Load Balancing

As data permeates all disciplines, the role of big data becomes increasingly important. Sensors, IoT devices, social networks, and online transactions are all generating data that can be monitored constantly to enable a business to identify opportunity to enhance customer service and increase revenue. This need for real-time processing of big data has led to the development of frameworks for di...


Publication date: 2005